Abstract 


The land surface freeze/thaw (F/T) state plays a key role in the hydrological and carbon 
cycles and thus affects water and energy exchanges and vegetation productivity at the land 
surface. In this study, we developed an F/T assimilation algorithm for the NASA Goddard Earth 
Observing System, version 5 (GEOS-5) modeling and assimilation framework. The algorithm 
includes a newly developed observation operator that diagnoses the landscape F/T state in the 
GEOS-5 Catchment land surface model. The F/T analysis is a rule-based approach that adjusts 
Catchment model state variables in response to binary F/T observations, while also considering 
forecast and observation errors. A regional observing system simulation experiment was 
conducted using synthetically generated F/T observations. The assimilation of perfect (error-free) 
F/T observations reduced the root-mean-square errors (RMSE) of surface temperature and soil 
temperature by 0.206 °C and 0.061 °C, respectively, when compared to model estimates 
(equivalent to a relative RMSE reduction of 6.7% and 3.1%, respectively). For a maximum 
classification error (CEmax) of 10% in the synthetic F/T observations, the F/T assimilation 
reduced the RMSE of surface temperature and soil temperature by 0.178 °C and 0.036 °C, 
respectively. For CEmax=20%, the F/T assimilation still reduces the RMSE of model surface 
temperature estimates by 0.149 °C but yields no improvement over the model soil temperature 
estimates. The F/T assimilation scheme is being developed to exploit planned operational F/T 
products from the NASA Soil Moisture Active Passive (SMAP) mission. 

1. Introduction 


Over one-third of the global land area undergoes a seasonal transition between 
predominantly frozen and non-frozen conditions each year (Kim et al. 2011). This land surface 
freeze/thaw (F/T) transition is closely linked to the timing and length of the vegetation growing 
season (e.g. Black et al. 2000; Grippa et al. 2005; Kimball et al. 2006), the seasonal evolution of 
land-atmosphere carbon dioxide exchange (Goulden et al. 1998) and the timing of seasonal 
snowmelt, soil thaw and spring flood pulses (Kimball et al. 2001; Rawlins et al. 2005; Kane et al. 
2008). The land surface F/T state thus acts as a natural on/off switch for hydrological and 
biospheric processes over northern land areas and upper elevations where seasonal frozen 
temperatures represent a significant portion of the annual cycle (Kim et al. 2011). 

Studies show that the growing season, vegetation productivity and land-atmosphere CO2 
exchange patterns are shifting as a result of global warming (e.g. Randerson et al. 1999; Nemani 
et al. 2003). For example, Smith et al. (2004), McDonald et al. (2004) and Kimball et al. (2006) 
found consistency between these patterns and changes in seasonal F/T dynamics observed by 
satellite microwave remote sensing. Thus, for more accurate modeling and prediction of land 
surface hydrological and biospheric processes, a good representation of the landscape F/T state 
in land surface schemes is needed. Recent efforts to enhance F/T modeling through improved 
and more expansive representation of permafrost include work on the Community Land Model 
(CLM; Lawrence et al. 2008; Lawrence et al. 2012), ORCHIDEE (Koven et al. 2009), the Joint 
UK Land Environment Simulator (JULES; Dankers et al. 2011) and the pan-Arctic Water 
Balance Model (Rawlins et al. 2013). 

Surface air temperature measurements from regional weather stations can provide an 
indication of the landscape F/T state. However, the limited coverage of global weather station 
networks, especially at higher latitudes and elevations, severely limits the capability for global 
monitoring and the ability to capture F/T spatial and temporal patterns (Kim et al. 2011). 
Satellite observations of passive and active microwaves are well suited for characterizing the 
landscape F/T state (Frolking et al. 1999; Bateni et al. 2012; Kontu et al. 2010). Lower 
frequency (< 37 GHz) microwave observations vary significantly between frozen and thawed 
landscapes as a result of the strong sensitivity to contrasting dielectric properties. A number of 
algorithms have been developed to detect the landscape F/T state at 25 - 50 km resolution using 
brightness temperature measurements from the Advanced Microwave Scanning Radiometer for 
the Earth Observing System (Zhao et al. 2011), the Scanning Multichannel Microwave 
Radiometer (Zuerndorfer et al. 1992), the Special Sensor Microwave Imager (Zhang et al. 2001) 
and the Soil Moisture and Ocean Salinity mission (Kontu et al. 2010). Similarly, radar 
backscatter data have been utilized in several studies for the detection of the land surface F/T 
state (Frolking et al. 1999; Kimball et al. 2001; Bartsch et al. 2011). The L-band (1.4 GHz) radar 
observations from the Soil Moisture Active Passive (SMAP) mission (to be launched in 2014) 
will provide a global classification of the F/T state at a 3 km spatial resolution and with a 3-day 
temporal fidelity (Entekhabi et al. 2012). 

The assimilation of remotely sensed F/T retrievals into land surface models might improve 
the simulation of carbon and hydrological processes that are especially relevant during F/T 
transitions. In this study the potential of the F/T assimilation to improve estimates of land 
surface (skin) and soil temperature is investigated. To this end, an algorithm was developed for 
the assimilation of binary F/T observations into the NASA Catchment land surface model 
(Koster et al. 2000) within the NASA Goddard Earth Observing System, version 5 (GEOS-5) 
modeling and assimilation framework. The assimilation algorithm includes a newly developed 
observation operator that diagnoses the F/T state of the Catchment model and is compatible with 
the information contained in the remotely sensed landscape F/T state at different microwave 
frequencies. The F/T analysis consists of a rule-based approach that updates Catchment model 
prognostic variables for surface and soil temperature in response to binary F/T observations and 
considers forecast and observation errors. In order to test the methodology, an observing system 
simulation experiment is conducted using synthetically generated F/T observations. The ultimate 
goal of this study is to provide a framework for the assimilation of F/T retrievals from SMAP 
into the Catchment model in the context of the SMAP Level 4 Surface and Root Zone Soil 
Moisture (L4_SM) algorithm (Reichle et al. 2012) and the SMAP Level 4 Carbon algorithm 
(Kimball et al. 2012). Future research will explore the direct assimilation of brightness 
temperature or backscatter measurements to analyze the landscape F/T state. 

2. F/T detection using remote sensing 

At microwave frequencies, the landscape dielectric constant and thus the radar backscatter 
and the emission of passive microwaves undergo large temporal changes associated with 
corresponding changes in the predominant landscape F/T state within the satellite footprint 
(Mironov et al. 2010), which makes space-borne microwave measurements well suited for global 
F/T monitoring (Kim et al. 2011). In most studies, 0 °C is considered the temperature threshold 
between the frozen and thawed states (Colliander et al. 2012). The temperature at which the F/T 
transition occurs, however, varies with the water solute concentration and shows strong 
heterogeneity across different landscape elements and within the satellite field of view. Thus, the 
0 °C threshold is only an approximation of the landscape F/T transition point. 

The contribution of different land surface elements to the retrieved F/T index depends on the 
microwave frequency used for the F/T classification. Colliander et al. (2012) used QuikSCAT Ku-band 
(13.4 GHz) backscatter measurements to investigate the relationship between individual 
land surface elements (e.g. soil, snow cover, and vegetation) and the aggregate landscape F/T 
state indicated by the surface backscatter. It was observed that the temperature of the soil and 
that of vegetation stems and branches are generally better indicators of Ku band F/T dynamics 
than surface air temperature, with soil temperature being a better indicator than vegetation 
temperature. Colliander et al. (2012) did not consider the effect of snow cover despite the fact 
that for their study domain the frozen condition is dominated by a snow-covered landscape. The 
rationale for their approach is the fact that the landscape thawing can be detected even under 
snow-covered conditions, as demonstrated by Kimball et al. (2004a,b) using Ku-band 
measurements from the NASA Scatterometer. Due to their longer wavelength, L-band (1.4 GHz) 
observations from SMAP should be less sensitive to snow and vegetation scattering effects under 
dry/frozen snow conditions and penetrate more deeply into the soil than Ku-band measurements. 
This increases the sensitivity of the microwave signals to the F/T state of the underlying surface 
soil layer. 

However, for wet snow the penetration depth of microwaves is drastically reduced to a few 
centimeters or less (Mätzler et al. 1984). Thus, sensitivity to soil conditions is minimal under wet 
snow and the satellite signal will largely reflect snow cover conditions when a significant amount 
of wet snow is present on the surface. 

3. F/T diagnosis using the Catchment land surface model 

This section first provides a brief description of the NASA GEOS-5 Catchment model 
(Koster et al. 2000; Ducharne et al. 2000; Reichle et al. 2011; Reichle 2012), a state-of-the-art 
global land surface model. Next, an observation operator is introduced for the diagnosis of the 
landscape F/T state in the model. This observation operator is needed for the F/T analysis 
(section 4) and designed to be compatible with the information contained in remotely sensed F/T 
observations at different microwave frequencies. 

a. Catchment model overview 

The Catchment model’s basic computational unit is the hydrological catchment (or 
watershed). In each catchment, the vertical profile of soil moisture is determined by the 
equilibrium soil moisture profile from the surface to the water table and by two additional 
variables that describe deviations from the equilibrium profile in a 1-m root zone layer and in a 
2-cm surface layer, respectively. Based on soil moisture, each catchment is separated into three 
distinct and dynamically varying subareas: a saturated region, an unsaturated region and a 
wilting region. The Catchment model also includes a three-layer snow model that accounts for 
snow melting and refreezing, dynamic changes in snow density, snow insulating properties, and 
other physics relevant to the growth and ablation of the snowpack (Stieglitz 1994). 

In the snow-free portion of the catchment, the surface energy balance is computed separately 
for the saturated, unsaturated, and wilting subareas of each catchment. In each of these three 
subareas, the land surface temperature is modeled with surface temperature prognostic variables 
that are specific to the soil moisture regime (Tc1 for the saturated region, Tc2 for the unsaturated 
region and Tc4 for the wilting region). For tropical forest land tiles, the Tc1, Tc2 and Tc4 fields are tied 
to approximately the top 5 cm of soil, whereas for all other tiles the effective soil depth associated 
with these variables is negligible (Reichle 2012). The area-weighted average of the three 
prognostic surface temperature variables determines the surface temperature in the absence of 
snow, $T_{surf}^{\text{snow-free}}$, which is then averaged (again area-weighted) with the surface snow 
temperature, $T_{surf}^{\text{snow}}$, to provide the land surface temperature of the entire catchment, $T_{surf}$: 

$$T_{surf} = (1 - asnow)\,T_{surf}^{\text{snow-free}} + asnow\,T_{surf}^{\text{snow}} \qquad (1)$$

The surface snow temperature and the snow area fraction (asnow) are themselves diagnosed 
from the model’s snow prognostic variables (snow water equivalent, snow depth, and snow heat 
content). 
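
The two-step area weighting behind Equation (1) can be illustrated with a minimal Python sketch. The function name and the argument names for the subarea fractions (`a_sat`, `a_unsat`, `a_wilt`) are hypothetical and are not Catchment model identifiers; temperatures are taken in °C and fractions in the range 0 to 1.

```python
def catchment_surface_temperature(tc1, tc2, tc4, a_sat, a_unsat, a_wilt,
                                  t_snow, asnow):
    """Illustrative area-weighted catchment surface temperature.

    tc1, tc2, tc4 : surface temperatures of the saturated, unsaturated and
                    wilting subareas
    a_sat, a_unsat, a_wilt : area fractions of the three subareas
                    (hypothetical names, normalized here for safety)
    t_snow : surface snow temperature
    asnow  : snow cover area fraction (0..1)
    """
    total = a_sat + a_unsat + a_wilt
    # Step 1: area-weighted snow-free surface temperature
    t_snowfree = (a_sat * tc1 + a_unsat * tc2 + a_wilt * tc4) / total
    # Step 2: blend snow-free and snow-covered portions, as in Equation (1)
    return (1.0 - asnow) * t_snowfree + asnow * t_snow
```

For example, subarea temperatures of 2, 4 and 6 °C with area fractions 0.5, 0.25 and 0.25 give a snow-free temperature of 3.5 °C; with 20% snow cover at -1.5 °C the catchment value is 2.5 °C.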

Subsurface temperatures are modeled using a soil heat diffusion model that consists of six 
layers. The thicknesses of the layers are about 10, 20, 40, 75, 150, and 1,000 cm starting from the 
top-most soil temperature layer. The layer thicknesses are the same for all land tiles. (For 
tropical forests, the layers of the heat diffusion model are shifted downward by the 5 cm 
thickness of the surface layer; see above.) The prognostic variables for the heat diffusion model 
are the ground heat contents (ght) in the six layers from which the soil temperatures ($T_{soil}$) in 
each layer are diagnosed. For the remainder of this paper, ght and $T_{soil}$ refer to the values in the 
top-most (10 cm thick) soil layer only. 

b. Freeze/thaw state in the Catchment model 

The F/T analysis (section 4) requires diagnosing the landscape F/T state of the Catchment 
model based on its prognostic variables. As outlined in section 2, the landscape F/T state 
observed by L-band microwave remote sensing is assumed to be primarily related to the near-surface 
soil and vegetation canopy temperature under dry/frozen snow conditions. Under wet 
snow, however, the satellite F/T signal will largely reflect snow cover conditions. We therefore 
first define an effective temperature, $T_{eff}$, that vertically averages the (snow-free) portion of the 
surface temperature, $T_{surf}^{\text{snow-free}}$, and the top-layer soil temperature, $T_{soil}$: 

$$T_{eff} = (1 - \alpha)\,T_{surf}^{\text{snow-free}} + \alpha\,T_{soil} \qquad (2)$$

Given the wavelengths used for F/T remote sensing, which typically range from 1 cm to 20 cm, 
and the resulting penetration depths, the contribution of the lower-layer soil temperatures to the 
microwave signal is small and neglected here. The parameter α determines the relative 
contributions of the surface temperature and the soil temperature and can be adjusted according 
to the microwave frequency used for the F/T classification so that it better reflects sensor signal 
penetration depth. Besides the (snow-free) effective temperature, $T_{eff}$, additional information on 
the landscape F/T state is contained in the modeled snow conditions. Here, the snow cover area 
fraction, asnow, is most relevant. In the Catchment model, the snow cover fraction increases 
linearly with the snow water equivalent (SWE) during the accumulation phase and reaches full 
cover (asnow=100%) when the total amount of SWE accumulated over the catchment reaches a 
model constant of WEMIN=26 kg m$^{-2}$ (Reichle et al. 2011). 

The landscape F/T state is then diagnosed from the Catchment model variables via the 
following observation operator, which is also illustrated in Figure 1: 

$$
F/T = \begin{cases}
+1 \ (\text{thawed}) & \text{if } T_{eff} > T_{eff\_threshold} \text{ and } asnow < asnow_{threshold} \\
-1 \ (\text{frozen}) & \text{if } T_{eff} \le T_{eff\_threshold} \text{ or } asnow \ge asnow_{threshold}
\end{cases} \qquad (3)
$$

The effective temperature that determines the transition between frozen and thawed 
conditions is $T_{eff\_threshold}$ = 0 °C. The snow cover threshold value $asnow_{threshold}$ determines the 
maximum modeled snow cover fraction that is still compatible with a thawed condition. This 
value is fixed at 10% in this study and depends on the microwave frequency and the associated 
penetration depth through snow. The penetration depth at C-band (5.6 GHz) can be as large as 
several meters in dry snow conditions (Bingham and Drinkwater 2000; Dall et al. 2001) and is 
likely even larger at L-band (1.27 GHz; Rignot et al. 2001). For wet snow, however, the 
penetration depth of microwaves is drastically reduced to a few centimeters or less (Mätzler et al. 
1984). 
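
The observation operator described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the GEOS-5 implementation: the function names are hypothetical, the placement of α on the soil temperature term and the treatment of values exactly at a threshold are assumptions, and the snow cover fraction is approximated as linear in SWE up to WEMIN (the text states this only for the accumulation phase).

```python
import numpy as np

WEMIN = 26.0  # kg m-2, SWE at which full snow cover is reached (from the text)

def snow_cover_fraction(swe, wemin=WEMIN):
    """Snow cover fraction rising linearly with SWE and capped at 1 (assumed form)."""
    return np.clip(np.asarray(swe, dtype=float) / wemin, 0.0, 1.0)

def effective_temperature(t_surf_snowfree, t_soil, alpha=0.5):
    """Weighted average of snow-free surface and top-layer soil temperature.

    Assumption: alpha weights the soil term and (1 - alpha) the surface term.
    """
    t_surf_snowfree = np.asarray(t_surf_snowfree, dtype=float)
    t_soil = np.asarray(t_soil, dtype=float)
    return (1.0 - alpha) * t_surf_snowfree + alpha * t_soil

def diagnose_ft(t_eff, asnow, t_eff_threshold=0.0, asnow_threshold=0.10):
    """Binary F/T state: +1 (thawed) or -1 (frozen).

    Thawed requires a warm effective temperature AND little snow cover;
    everything else is classified as frozen.
    """
    t_eff = np.asarray(t_eff, dtype=float)
    asnow = np.asarray(asnow, dtype=float)
    thawed = (t_eff > t_eff_threshold) & (asnow < asnow_threshold)
    return np.where(thawed, 1, -1)
```

With the default α=0.5, a grid cell with a snow-free surface temperature of 2 °C and a soil temperature of 1 °C has an effective temperature of 1.5 °C; it is diagnosed as thawed when snow-free and as frozen when half of the cell is snow covered.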

4. F/T data assimilation module (F/T analysis) 

The assimilation of F/T observations is conceptually similar to the assimilation of snow 
cover observations. In both cases, the observed variable is, at least at a finer spatial scale, 
essentially a binary observation. Binary observations cannot be assimilated with a Kalman filter, 
because this requires continuous variables. For the assimilation of F/T observations, we propose 
a rule-based assimilation approach, similar to the rule-based assimilation of binary snow cover 
observations (Rodell and Houser 2004). In short, if the model forecast and the corresponding 
SMAP observations disagree on the F/T state, that is, if the model indicates frozen conditions 
and the observation indicates thawed conditions (or vice versa), the model prognostic variables 
related to the soil temperature ($T_{soil}$) and the snow-free surface temperature ($T_{surf}^{\text{snow-free}}$) are 
adjusted to match the observed F/T condition more closely. To account for model and 
observation errors, the delineation between frozen and thawed regimes is defined with some 
uncertainty in the assimilation algorithm, as will be detailed below. 

a. Uncertainty in F/T simulations and observations 

Perhaps the simplest F/T analysis could use the observation operator defined in Equation (3) 
to determine the F/T state of the model forecast and then apply increments to switch the model’s 
F/T state whenever the model’s F/T state differs from that of the observations. However, such an 
analysis would ignore any uncertainty associated with the formulation of the observation 
operator (Equation (3)). It would also ignore any errors in the observations themselves. 

For the purpose of the F/T analysis, we therefore refine the observation operator by 
introducing a regime of undetermined F/T status, which is defined by upper and lower bounds 
for the effective temperature and snow cover thresholds, as illustrated in Figure 2. Specifically, 
the model F/T state for the purpose of the F/T analysis is: 

$$
F/T = \begin{cases}
+1 \ (\text{completely thawed}) & \text{if } T_{eff} > UB\_T_{eff} \text{ and } asnow < LB\_asnow \\
-1 \ (\text{completely frozen}) & \text{if } T_{eff} < LB\_T_{eff} \text{ or } asnow > UB\_asnow \\
\ \ 0 \ (\text{undetermined}) & \text{otherwise}
\end{cases} \qquad (4)
$$

In this study, LB_Teff and UB_Teff are fixed at -1 °C and +1 °C, and LB_asnow is set to 5%. A 
value of 100% was chosen for UB_asnow. This assigns an “undetermined” F/T regime to 
situations with considerable snow cover on soil that is thawed or close to thawing. Under these 
circumstances, it is difficult to determine whether the model F/T state should be thawed or frozen 
in a manner that would be fully consistent with the retrieval algorithm that was used to determine 
the value of the F/T observation. 

The “undetermined” regime impacts the computation of the increments in two ways. Firstly, 
if the model forecast F/T state is “undetermined”, no increments will be applied. Secondly, the 
upper and lower bounds for the effective temperature threshold (UB_Teff, LB_Teff) will be used to 
formulate the rule-based increments that result from the F/T analysis (section 4b). In either case, 
the “undetermined” regime implicitly assigns weight to the model forecast in the analysis update 
and thus assumes imperfect observations. 
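
The three-regime diagnosis of Equation (4) can be sketched as follows. The function name is hypothetical; the default bounds (-1 °C and +1 °C for the effective temperature, 5% and 100% for the snow cover fraction) are the values quoted in this study, with snow cover expressed as a fraction between 0 and 1.

```python
import numpy as np

def diagnose_ft_with_uncertainty(t_eff, asnow,
                                 lb_teff=-1.0, ub_teff=1.0,
                                 lb_asnow=0.05, ub_asnow=1.0):
    """Return +1 (completely thawed), -1 (completely frozen) or 0 (undetermined)."""
    t_eff = np.asarray(t_eff, dtype=float)
    asnow = np.asarray(asnow, dtype=float)
    thawed = (t_eff > ub_teff) & (asnow < lb_asnow)
    frozen = (t_eff < lb_teff) | (asnow > ub_asnow)
    # The two conditions are mutually exclusive as long as lb <= ub
    return np.where(thawed, 1, np.where(frozen, -1, 0))
```

Note that with UB_asnow=100% the snow condition can never trigger the frozen regime on its own, so warm soil under substantial snow cover falls into the undetermined regime, as described above.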

b. Update rules 

The assimilation of F/T observations is based on a number of rules. No updates are 
performed (i) if both the model and the observations agree on the F/T state, or (ii) if the model 
F/T state is undetermined per Equation (4). When the observations and simulations indicate a 
contrasting F/T state, then the model prognostic variables associated with $T_{eff}$ are updated (i.e., 
Tc1, Tc2, Tc4, and ght; section 3). Specifically, if the observations indicate a thawed condition 
(F/T=1) whereas the model is in a frozen regime, then $T_{eff}$ is increased to the lower bound 
LB_Teff. Conversely, if the observations indicate freezing (F/T=-1) and the model is in a thawed 
regime, then $T_{eff}$ is decreased to the upper bound UB_Teff. The updates can be summarized as 
follows: 


$$
\begin{aligned}
&\text{If obs F/T} = -1,\ \text{model F/T} = 1 \text{ and } \Delta T = (UB\_T_{eff} - T_{eff}^{-}) < 0, \text{ then } T_{eff}^{+} = T_{eff}^{-} + \Delta T \\
&\text{If obs F/T} = 1,\ \text{model F/T} = -1 \text{ and } \Delta T = (LB\_T_{eff} - T_{eff}^{-}) > 0, \text{ then } T_{eff}^{+} = T_{eff}^{-} + \Delta T
\end{aligned} \qquad (5)
$$

In this equation, $T_{eff}^{-}$ represents the a priori estimate and $T_{eff}^{+}$ represents the analysis. The 
same increment ΔT is applied to the prognostic temperature variables Tc1, Tc2 and Tc4 (the 
weighted average of which determines $T_{surf}^{\text{snow-free}}$) and the soil temperature, $T_{soil}$. For the latter, the 
ground heat content (ght, the model prognostic variable that determines the soil temperature) is 
adjusted accordingly to match the updated soil temperature. Note that the updates to Tc1, Tc2 
and Tc4 also adjust $T_{surf}$ following Equation (1). In this study we are only updating the surface 
temperature and the soil temperature (and ground heat content) of the top-most soil layer. For 
future studies, updating the temperature of lower soil layers can also be considered. 
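
The update rules of Equation (5) amount to computing a single temperature increment per grid cell. The sketch below is an illustration under stated assumptions, not the GEOS-5 code: the function name is hypothetical, the model F/T state is assumed to come from the three-regime diagnosis of Equation (4) (+1, -1 or 0), and the conversion of the soil temperature increment into a ground heat content change is left out.

```python
import numpy as np

def ft_increment(obs_ft, model_ft, t_eff_prior, lb_teff=-1.0, ub_teff=1.0):
    """Temperature increment (deg C) implied by the rule-based F/T analysis.

    obs_ft      : observed F/T index (+1 thawed, -1 frozen)
    model_ft    : model F/T regime (+1, -1, or 0 for undetermined)
    t_eff_prior : a priori effective temperature
    Returns 0 where model and observation agree or the model is undetermined.
    """
    obs_ft = np.asarray(obs_ft)
    model_ft = np.asarray(model_ft)
    t = np.asarray(t_eff_prior, dtype=float)

    # Observation frozen, model thawed: cool down to the upper bound
    cool = (obs_ft == -1) & (model_ft == 1) & (ub_teff - t < 0)
    # Observation thawed, model frozen: warm up to the lower bound
    warm = (obs_ft == 1) & (model_ft == -1) & (lb_teff - t > 0)

    delta = np.zeros_like(t)
    delta = np.where(cool, ub_teff - t, delta)
    delta = np.where(warm, lb_teff - t, delta)
    return delta
```

The same increment would then be added to Tc1, Tc2, Tc4 and the top-layer soil temperature. For instance, a thawed model cell at 3 °C that is observed as frozen receives an increment of -2 °C, which places its effective temperature at the upper bound of +1 °C.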

The update rules (Equation (5)) intentionally do not adjust the snow variables directly. As 
mentioned in section 4a, an upper bound of UB_asnow=100% has been selected to avoid 
uncertainties related to the role of snow in determining the F/T state. This choice is supported by 
several experiments that were performed with smaller threshold values for UB_asnow and in 
which a portion of the snow was removed if the observed F/T state indicated thawed conditions. 
These additional experiments (not shown) indicated that (error-prone) F/T observations 
sometimes mistakenly removed the model snow, which resulted in large subsequent forecast 
errors. It is difficult to recover from such errors, because once the model snow has been 
removed, the missing snow cannot easily be re-deposited at future analysis times due to the lack 
of quantitative information about snow mass in the F/T observations. Consequently, in the 
following the snow prognostic variables are not adjusted as part of the F/T analysis update. 
Nevertheless, at later time steps the model’s snow conditions will respond to the adjusted soil 
temperatures and corresponding updated hydrological fluxes. 

5. Synthetic twin experiment 

The twin experiment consists of several components. A Catchment land surface model 
integration serves as the “truth” and is used (i) to generate synthetic F/T observations and (ii) to 
validate the analysis results. The data assimilation experiment is performed with imperfect 
simulations and observations. The synthetic observed F/T state is obtained by adding 
classification error to the true F/T state (Section 5b). The imperfect Catchment land surface 
model integration is produced with a different forcing dataset to mimic forcing errors. This 
imperfect model simulation without data assimilation is referred to as the open loop (OL) (see 
discussion in section 5b). The F/T analysis is performed by assimilating the synthetic F/T 
observations into the imperfect model simulation using erroneous forcing data, and is referred to 
as the data assimilation (DA) integration. The OL and DA results are compared against the truth 
and the relative importance of assimilating observed F/T data is investigated (section 6). 

a. Study domain and time period 

The study domain is a region in North America between 45-55°N and 90-110°W (Figure 3). 
The simulations are performed on a 36 km Equal-Area Scalable Earth (EASE) grid, covering 
1,137 grid cells in the study domain. The Catchment model integration is conducted using the 
GEOS-5 land data assimilation system (Reichle et al. 2014) with a time step of 20 min. The 
selected period of investigation is 8 years (1 January 2002 - 1 January 2010) and the temporal 
resolution of the model output is 3-hourly. The model was spun up by cycling ten times through 
the 1-year period from 1 January 2001 to 1 January 2002. 

b. Synthetic truth, synthetic observations, and open loop 

The synthetic truth is based on a Catchment model simulation that uses surface 
meteorological forcing data from the Modern-Era Retrospective analysis for Research and 
Applications (MERRA; Rienecker et al. 2011). The MERRA data product is provided at an 
hourly temporal resolution and a 1/2° × 2/3° (latitude/longitude) spatial resolution. The resulting 
8 years of synthetic true hydrological state variables and fluxes are used for the validation of the 
F/T analysis (DA). The synthetic true F/T state is obtained by applying the observation operator 
(Equation (3)) using α=0.5, $asnow_{threshold}$=10%, and $T_{eff\_threshold}$=0 °C. 

The synthetic observed F/T indices are obtained by corrupting the true F/T data set with 
synthetic classification error. Specifically, the classification error is defined by the probability of 
misclassification. The SMAP mission requirements call for an F/T product with no more than 
20% mean spatial classification error (McDonald et al. 2012). Here, we assume that the 
classification error is greatest near 0 °C, where it reaches CEmax, linearly tapers off towards 
colder and warmer temperatures and vanishes below -10 °C and above +10 °C. That is, the 
classification error is given by a piecewise linear function of the land surface temperature, $T_{surf}$ (in °C), 
as follows: 

$$
CE(T_{surf}) = \begin{cases}
CE_{max}\,\dfrac{10 + T_{surf}}{10} & -10\,^{\circ}\mathrm{C} \le T_{surf} \le 0\,^{\circ}\mathrm{C} \\[2ex]
CE_{max}\,\dfrac{10 - T_{surf}}{10} & 0\,^{\circ}\mathrm{C} < T_{surf} \le 10\,^{\circ}\mathrm{C} \\[2ex]
0 & T_{surf} > 10\,^{\circ}\mathrm{C} \text{ or } T_{surf} < -10\,^{\circ}\mathrm{C}
\end{cases} \qquad (6)
$$

This parameterization of the classification error is illustrated in Figure 4. 

The synthetic F/T observations are generated at each time and for each location (or grid cell) 
by obtaining the probability of misclassification based on the land surface temperature $T_{surf}$ from 
Equation (6). We then randomly select a number from a uniform distribution between 0 and 1. If 
the selected random number is less than the specified classification error for that land surface 
temperature, then the observed F/T index is obtained by changing the sign of the true F/T 
classification. Otherwise, the observed F/T index is equal to the true F/T state. The sensitivity of 
the data assimilation experiments to different levels of observation classification errors will be 
investigated below. 
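
The generation of synthetic observations can be sketched in Python as follows. This is a minimal illustration of the procedure described above, assuming the classification error is a triangular function of the land surface temperature that peaks at CEmax at 0 °C and vanishes beyond ±10 °C; the function names and the use of NumPy's random generator are choices made for this example only.

```python
import numpy as np

def classification_error(t_surf, ce_max):
    """Probability of misclassification as a function of surface temperature (deg C)."""
    t = np.asarray(t_surf, dtype=float)
    # Linear taper from ce_max at 0 deg C to zero at +/-10 deg C, zero beyond
    return ce_max * np.clip(1.0 - np.abs(t) / 10.0, 0.0, None)

def corrupt_ft(true_ft, t_surf, ce_max, rng):
    """Flip the sign of the true F/T index with probability CE(t_surf)."""
    true_ft = np.asarray(true_ft)
    ce = classification_error(t_surf, ce_max)
    # One uniform random draw in [0, 1) per observation
    flip = rng.random(true_ft.shape) < ce
    return np.where(flip, -true_ft, true_ft)

# Example: CEmax = 20% applied to five grid cells
rng = np.random.default_rng(seed=0)
truth = np.array([1, 1, -1, -1, 1])
t_surf = np.array([0.5, 8.0, -0.5, -15.0, 12.0])
obs = corrupt_ft(truth, t_surf, ce_max=0.20, rng=rng)
```

Cells colder than -10 °C or warmer than +10 °C are never flipped in this sketch, so misclassification is concentrated around the freezing point, where the F/T retrieval is expected to be least reliable.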

The open loop data set is obtained from an integration of the Catchment model with forcing 
data that differ from those used for the truth. Forcing errors were imposed by replacing the 
MERRA surface meteorological forcing fields with data from the Global Land Data Assimilation 
System (GLDAS; Rodell et al. 2004) as used in a former version of the NASA GMAO seasonal 
prediction system at 3-hourly temporal resolution and at 2.0° × 2.5° (latitude/longitude) spatial 
resolution. The hydrological response associated with the differences between MERRA and 
GLDAS in precipitation and radiation timing and intensity results in considerable differences in 
the diagnosed F/T state at the grid scale. 

c. F/T assimilation setup 

The F/T assimilation experiment uses the same model settings as described for the open loop 
model, that is, it uses GLDAS forcings to mimic forcing errors relative to the MERRA truth. No 
additional perturbations are imposed and a single deterministic integration is performed for a 
period of 8 years (1 January 2002 - 1 January 2010). In this study, the synthetic observed F/T 
index is assimilated into the imperfect model integration at 6:00 am and 6:00 pm local time (F/T 
analysis update). The proposed assimilation time steps are compatible with the planned overpass 
times of SMAP. 

The various tunable parameters in the diagnosis of the (uncertain) F/T state and the update rules 
are as follows. The parameter α (which determines the weight of the components of the 
effective temperature, Equation (2)) is set to 0.5 for the generation of F/T observations. This 
parameter is tunable and the sensitivity of data assimilation experiments to this parameter in the 
observation operator (Equation (3)) will be explored in section 6b. The values for the lower and 
upper bounds on the snow cover threshold [LB_asnow, UB_asnow] are 5% and 100% snow 
cover, respectively. The uncertainty range for asnow accounts for the combined uncertainty 
associated with the diagnosis of the modeled F/T state and the classification of the F/T 
observations in the presence of snow. In order to account for the uncertainty of the 0°C threshold 
value resulting from water solute concentration across different landscape elements within the 
satellite field of view, the upper and lower bounds for the effective temperature thresholds are 
+1°C and -1°C, respectively. The F/T analysis may benefit from adjusting these uncertainty 
bounds in response to the F/T classification error in the synthetic observations, but in the present 
paper we keep the bounds fixed. 

d. Validation of temperature estimates 

By design, the analysis update (Equation (5)) does not alter the F/T state of the model 
forecast, but the update rules will alter the temperature variables whenever the model forecast 
F/T state differs from the observed F/T index. It is expected that the differences in surface and 
soil temperatures (with respect to the truth) are smaller in the assimilation estimates than in the 
open loop estimates. We therefore focus the validation on the computation of root-mean-square 
errors (RMSE) of surface and soil temperatures versus the truth data set. 

F/T data assimilation is expected to be most relevant when temperatures are near 0°C 
because it is straightforward to estimate the F/T state accurately during clearly warm or cold 
conditions. We thus limit the validation to time steps where the air temperature is above -7°C 
and below +7°C (as indicated by the MERRA surface air temperatures). Furthermore, we restrict 
the validation to 6:00 am and 6:00 pm local time only, compatible with the time of the SMAP 
overpasses. 
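
The validation metric can be sketched as an RMSE restricted to near-freezing conditions. The function below is an illustrative assumption about how such a masked RMSE might be computed; the names are hypothetical, and the additional restriction to 6:00 am and 6:00 pm local time is assumed to have been applied to the inputs beforehand.

```python
import numpy as np

def masked_rmse(estimate, truth, t_air, t_min=-7.0, t_max=7.0):
    """RMSE of estimate vs. truth over samples with t_min < t_air < t_max (deg C)."""
    estimate = np.asarray(estimate, dtype=float)
    truth = np.asarray(truth, dtype=float)
    t_air = np.asarray(t_air, dtype=float)
    mask = (t_air > t_min) & (t_air < t_max)
    if not mask.any():
        return float("nan")  # no near-freezing samples to validate against
    diff = estimate[mask] - truth[mask]
    return float(np.sqrt(np.mean(diff ** 2)))
```

The skill improvement from assimilation is then the difference `masked_rmse(ol, truth, t_air) - masked_rmse(da, truth, t_air)`, which is positive when the assimilation estimates are closer to the truth than the open loop.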

6. Results and discussion 

a. Open loop (OL) and data assimilation (DA) with standard settings 

To assess the impact of the imperfect forcing on the diagnosis of the F/T state without data 
assimilation, we first examine the OL results. As mentioned in section 5, the OL utilizes GLDAS 
forcings and the “truth” utilizes MERRA forcings. When compared to the truth, the OL has an 
F/T classification error of 4.85% (Table 1). The table also shows that the RMSE value for the 
OL surface temperature ($T_{surf}$) is 3.1°C and that of the first soil layer temperature ($T_{soil}$) is 2.0°C. 

Again, by design the F/T analysis update does not alter the F/T state of the model forecast, 
and consequently the F/T classification error of the assimilation estimates is the same as that of 
the OL. But through the assimilation of the F/T observations, we hope to reduce the OL 
temperature errors. The F/T analysis involves adjusting the land surface effective temperature 
($T_{eff}$), and subsequently $T_{surf}$ and $T_{soil}$, if the observed and simulated F/T states do not agree. 

Table 2 summarizes the reduction in RMSE (ΔRMSE = RMSE_OL - RMSE_DA) by 
assimilating synthetic F/T observations with 4 different levels of classification error (CEmax), and 
assuming default values for the tunable parameters, as introduced in section 5c. 

Assimilating observed F/T indices without classification error results in an RMSE 
improvement of 0.206°C for the land surface temperature ($T_{surf}$) and an RMSE improvement of 
0.061°C for the first layer soil temperature. When compared to the OL results for these two 
variables, the F/T analysis results in relative RMSE improvements of 6.7% and 3.1% for 
$T_{surf}$ and $T_{soil}$, respectively. The skill improvement decreases monotonically with increasing 
classification error in the observations. For a maximum classification error of CEmax=20% the 
assimilation of F/T observations still reduces the surface temperature RMSE by 0.149°C but it no 
longer improves the soil temperature estimates. 
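
The relative improvements quoted above can be checked against the OL values in Table 1 (3.08 °C for the surface temperature and 1.97 °C for the soil temperature). The short sketch below assumes the relative improvement is simply the RMSE reduction divided by the OL RMSE; the function name is hypothetical.

```python
def relative_improvement(rmse_reduction, ol_rmse):
    """RMSE reduction expressed as a percentage of the open-loop RMSE."""
    if ol_rmse <= 0.0:
        raise ValueError("open-loop RMSE must be positive")
    return 100.0 * rmse_reduction / ol_rmse


# RMSE reductions from the text, OL RMSE values from Table 1 (degrees C).
rel_surf = relative_improvement(0.206, 3.08)  # surface temperature
rel_soil = relative_improvement(0.061, 1.97)  # first-layer soil temperature
```

Rounded to one decimal place, the two ratios give the 6.7% and 3.1% reported for the error-free case.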

Figure 5 shows the Tsurf and Tsoil skill improvements in the study domain for the assimilation 
of F/T observations with CEmax=0%, 5% and 20%. Figures 5a and 5b show that as a result of 
assimilating perfect F/T observations, the skill of Tsurf and Tsoil improves for almost all grid cells 
within the study domain. However, the efficiency of the F/T analysis deteriorates as the 
classification error is increased (Figures 5c-d). For CEmax=20%, many grid cells in the study 
domain have negative or no improvement in Tsoil skill. As mentioned above, the F/T analysis 
may benefit from adjusting the uncertainty bounds in response to the classification error of the 
synthetic F/T observations, but the above results indicate that using a single set of uncertainty 
bounds already provides reasonable assimilation estimates. 

Figure 6 shows the skill improvement for each grid cell binned as a function of the number of 
analysis updates per grid cell (that is, the skill improvement is spatially averaged across grid cells 
experiencing a similar number of analysis updates in time within the study domain). The data 
points are assigned to 6 bins with equal numbers of grid cells. Each bin center is assigned the 
average number of analysis updates for the grid cells in that particular bin. When more 
error-free observations (Figure 6a, b) or observations with modest classification errors (Figure 6c, d) are 
assimilated, the average skill improves with the number of analysis updates for both 
temperatures, Tsurf and Tsoil. However, as the maximum classification error is increased to 20% 
(Figure 6e, f), the average skill in the temperature variables does not improve with the number of 
analyses. This is due to the negative effect of assimilating misclassified observed F/T indices 
into the model. 
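
The binning described above can be sketched as follows. This is a minimal illustration, not the processing code used for Figure 6: the function name is hypothetical, the input values are made up, and the handling of a cell count that is not divisible by the number of bins (extra cells go to the first bins) is an assumption.

```python
def bin_by_update_count(update_counts, skill, n_bins=6):
    """Group grid cells into equal-count bins ordered by update count.

    Returns one (mean update count, mean skill) pair per bin.
    Assumption: leftover cells are spread over the first bins.
    """
    if len(update_counts) != len(skill):
        raise ValueError("inputs must have the same length")
    if len(skill) < n_bins:
        raise ValueError("need at least one grid cell per bin")
    # Indices of the grid cells, sorted by number of analysis updates.
    order = sorted(range(len(skill)), key=lambda i: update_counts[i])
    base, extra = divmod(len(order), n_bins)
    bins, start = [], 0
    for b in range(n_bins):
        size = base + (1 if b < extra else 0)
        members = order[start:start + size]
        start += size
        mean_updates = sum(update_counts[i] for i in members) / size
        mean_skill = sum(skill[i] for i in members) / size
        bins.append((mean_updates, mean_skill))
    return bins


# Toy data for 12 grid cells (made-up values, not experiment output).
counts = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
skills = [0.01, 0.03, 0.05, 0.07, 0.09, 0.11,
          0.13, 0.15, 0.17, 0.19, 0.21, 0.23]
binned = bin_by_update_count(counts, skills)
```

Each returned pair corresponds to one plotted point: the bin center on the horizontal axis and the spatially averaged skill on the vertical axis.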

b. Sensitivity of assimilation results to the formulation of the effective temperature 

The effective temperature, Teff, which is an important variable in diagnosing the F/T state, is 
a weighted average of the surface temperature in the absence of snow, Tsurf, and the soil 
temperature, Tsoil (Equation (2)). The weight (α) should be a function of the microwave 
penetration depth. An increase (decrease) in penetration depth results in a decrease (increase) in 
parameter α and hence an increase (decrease) in the weight of the soil temperature component of 
the effective temperature Teff. In this study, the synthetic true F/T state was obtained based on the 
assumption that the parameter α equals 0.5. Thus, Tsurf and Tsoil have equal weights in 
determining the effective temperature, Teff, and thus the F/T state of the soil. 
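
A minimal sketch of this weighting is given below, assuming that Equation (2) has the form Teff = α·Tsurf + (1 - α)·Tsoil, so that α is the weight of the snow-free surface temperature. The function name and the toy temperatures are hypothetical.

```python
def effective_temperature(t_surf, t_soil, alpha=0.5):
    """Weighted average of snow-free surface and top-layer soil temperature.

    alpha is the weight of the surface temperature and (1 - alpha) the
    weight of the soil temperature. Both inputs share one unit.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * t_surf + (1.0 - alpha) * t_soil


# Toy values in degrees C (made up): cold surface over warmer soil.
t_surf, t_soil = -3.0, 1.0
teff_equal = effective_temperature(t_surf, t_soil)  # alpha = 0.5
teff_surface_only = effective_temperature(t_surf, t_soil, alpha=1.0)
```

In this toy case the equal weighting gives -1 °C, whereas a surface-only weighting gives -3 °C, which shows how the choice of α shifts the temperature used to diagnose the F/T state.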

However, when determining the F/T index from (real) remote sensing observations, the 
relative effect of Tsurf and Tsoil in those observations is not known a priori. Here we investigate 
the sensitivity of the DA performance to the choice of this factor in the observation operator. A 
physically meaningful range of α between 0.25 and 1 was selected. This means that the weight 
of soil temperature, Tsoil, ranges between 0.75 and 0 in the model. 

The sensitivity of the assimilation results to the value of α in the forecasted F/T state is 
illustrated in Figure 7. The skill improvements (ΔRMSE) are shown for the case where no 
classification error (CEmax=0%) is associated with the assimilated F/T indices. As expected, the 
maximum skill improvement for both Tsurf and Tsoil occurs when the parameter α is 0.5, that is, 
when the α value that is used in the observation operator of the assimilation system matches the 
α value that was used to generate the synthetic F/T observations. The figure shows that the 
sensitivity of Tsurf to the parameter α seems to be higher than that of Tsoil. The skill of Tsurf is 
reduced by up to 50% when α is not selected correctly, while the skill is reduced by at most 8% 
for Tsoil. It is thus important to understand how different land surface variables contribute to the 
observed F/T and to mimic this relationship adequately in the F/T observation operator used in 
the data assimilation scheme. 

7. Conclusions 

In this study an algorithm for the diagnosis of the F/T state in the NASA Catchment land 
surface model was developed. The algorithm is compatible with the information contained in 
remotely sensed retrievals of landscape F/T state at different microwave frequencies. The GEOS-5 
land data assimilation system in offline mode was updated with the newly designed F/T 
assimilation module. The ultimate goal of this research is to provide a framework for the 
assimilation of SMAP (Soil Moisture Active Passive) F/T observations into the Catchment 
model. 

The performance of the method for a synthetic experiment showed encouraging 
improvements in the skill of soil temperature and land surface temperature estimates. However, 
the average skill improvement depends on the classification error in the F/T observations. In our 
synthetic study, the open loop simulation has a modeled F/T classification error of 4.85% 
compared to the truth. When assimilating perfect (error-free) F/T observations, the RMSE for 
land surface temperature (Tsurf) and soil temperature (Tsoil) improves by 6.7% and 3.1%, 
respectively. Yet, the skill improvement decreases monotonically with increasing classification 
error in the assimilated F/T observations. No further improvement in soil temperature was found 
for a maximum classification error of CEmax=20%. 

We also examined the sensitivity of the data assimilation (DA) to the α parameter in the 
observation operator. This parameter controls the relative contribution of the snow-free surface 
temperature and the top-layer soil temperature to the F/T state in the modeling system and 
impacts the temperature increments applied during the F/T analysis. The maximum skill 
improvement can only be expected if the observation operator in the modeling system closely 
mimics the relative importance of various landscape components, including the surface and soil 
temperatures, in the determination of the satellite F/T observations. Therefore, the observation 
operator could also benefit from further tuning to improve the linkage between the modeled 
snow cover and the expected F/T index retrieved from the microwave signal. Moreover, the 
limitations of the present study could perhaps be overcome in the future by directly assimilating 
backscatter or brightness temperature observations (instead of F/T retrievals). 

The regional domain of the experiment investigated in this research represents a relatively 
flat terrain area of central North America. In this region, the model without assimilation (open 
loop) produced an F/T classification error of only 4.85%. This modeling error is a direct result of 
the assumption that all F/T classification errors are solely due to errors in the forcing data (as 
reflected in the difference between the GLDAS and MERRA data). When the F/T assimilation 
method is applied to satellite observations (instead of synthetic retrievals), we expect larger 
errors in the simulated F/T state, especially over regions with more complex topography (e.g., 
regions in Western North America) where global forcing fields do not resolve the considerable 
heterogeneity of the surface conditions. In applications, the benefit of assimilating 
high-resolution (3 km) SMAP F/T retrievals is therefore expected to be greater for improving the 
simulation of eco-hydrological processes. 

Table 1. Metrics for OL vs. truth estimates for a period of 8 years (2002-2010) and at 6am and 
6pm local time. The RMSE for Tsurf and Tsoil is computed excluding times and locations where 
the temperature exceeds 7°C or falls below -7°C. 

Variable    Metric                  Value 
Tsurf       RMSE                    3.08 °C 
Tsoil       RMSE                    1.97 °C 
F/T         Classification error    4.85% 

Table 2. RMSE improvement (ΔRMSE = RMSE_OL - RMSE_DA, in °C) for Tsurf and Tsoil, for 
different maximum classification errors (CEmax), excluding times and locations where 
the temperature exceeds 7°C or falls below -7°C, for a period of 8 years (2002-2010) and at 6am 
and 6pm local time. 

Figure captions 

Figure 1. Schematic representation of the model diagnosis of the land surface F/T state as a 
function of (snow-free) effective temperature (Teff) and the snow cover fraction (asnow). 

Figure 2. Schematic representation of three distinct F/T state regimes defined by upper and lower 
uncertainty bounds on the effective temperature and snow cover thresholds for the purpose of the 
F/T analysis. The upper bound for the snow cover threshold is set to UB_asnow=100%. 

Figure 3. Map of study domain. 

Figure 4. Classification error function. 

Figure 5. ΔRMSE (= RMSE_OL - RMSE_DA) in (a, c, e) Tsurf and (b, d, f) Tsoil across the study 
domain for assimilation of synthetic F/T observations with (a, b) CEmax=0%, (c, d) CEmax=5%, 
and (e, f) CEmax=20%. A positive ΔRMSE indicates a skill improvement in the assimilation 
results. 

Figure 6. Spatially averaged ΔRMSE for (a,c,e) Tsurf and (b,d,f) Tsoil with 1 spatial standard 
deviation around the mean as a function of the number of analysis updates for the assimilation of 
synthetic F/T observations with (a,b) CEmax=0%, (c,d) CEmax=5%, and (e,f) CEmax=20%. A 
positive ΔRMSE indicates a skill improvement in the assimilation results. 

Figure 7. ΔRMSE for (a) Tsurf and (b) Tsoil, as a function of the α parameter chosen in the 
observation operator. A positive ΔRMSE indicates a skill improvement in the assimilation results. 